WHAT IS CLAIMED IS:

1. A method for matching patterns in a string of
symbols comprising:

identifying a first pattern of symbols to be matched, 
wherein the first pattern contains a prefix pattern, a value 
pattern and a suffix pattern; 

identifying candidate matches for the first pattern in 
the string, wherein each candidate match for the first pattern 
includes a candidate match for the prefix pattern, a candidate 
match for the suffix pattern and a candidate match for the 
value pattern; 

determining a cost associated with each of the candidate 
matches for the first pattern, wherein the cost associated 
with each of the candidate matches for the pattern includes a 
cost associated with the corresponding candidate match for the 
prefix pattern, a cost associated with the candidate match for 
the suffix pattern and a cost associated with the candidate 
match for the value pattern; and 

selecting one or more candidate matches for the pattern
that meet a cost selection criterion.

2. The method of claim 1 wherein determining a cost
associated with each of the candidate matches comprises
calculating a corresponding edit distance.
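
By way of illustration only, the edit distance of claim 2 might be computed as a Levenshtein distance with unit costs. The claim does not fix a particular metric, so the unit costs and the function name `edit_distance` are assumptions of this sketch.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-symbol
    insertions, deletions and substitutions that turn a into b."""
    # prev[j] holds the distance between the processed part of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

Under this sketch a candidate prefix such as "Tota1:" would have a cost of 1 against the prefix pattern "Total:".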

3. The method of claim 1 wherein identifying the first
pattern comprises providing a single example string wherein
the first pattern is selected from the example string.
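
One possible reading of claim 3, offered only as a sketch: a user marks the value inside a single example string, and a fixed amount of surrounding text is taken as the prefix and suffix patterns. The `Pattern` type, the function name and the `context` width are hypothetical; the claim itself does not say how the three parts are delimited.

```python
from typing import NamedTuple


class Pattern(NamedTuple):
    prefix: str
    value: str
    suffix: str


def pattern_from_example(example: str, value_start: int, value_end: int,
                         context: int = 8) -> Pattern:
    """Split one example string into prefix, value and suffix patterns.

    The value is example[value_start:value_end]; up to `context` symbols
    on either side become the prefix and suffix (an assumption here).
    """
    prefix_start = max(0, value_start - context)
    return Pattern(
        prefix=example[prefix_start:value_start],
        value=example[value_start:value_end],
        suffix=example[value_end:value_end + context],
    )
```

For the example string "Invoice total: $42.00 due" with the value "$42.00" selected and `context=7`, the sketch yields the prefix "total: " and the suffix " due".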

4. The method of claim 1 further comprising examining
the string to identify spans of interest, wherein each of the
spans of interest meets a specified filtering criterion.

5. The method of claim 4 wherein the specified
filtering criterion comprises the inclusion of a keyword.
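
As a non-limiting sketch of the keyword criterion of claims 4 and 5, the string could be reduced to spans that surround each occurrence of a keyword before any candidate matching is attempted. The `radius` parameter and the merging of overlapping spans are assumptions made for illustration.

```python
def spans_of_interest(text: str, keyword: str, radius: int = 40):
    """Return (start, end) index pairs covering each keyword occurrence
    plus `radius` symbols of context on both sides."""
    spans = []
    pos = text.find(keyword)
    while pos != -1:
        start = max(0, pos - radius)
        end = min(len(text), pos + len(keyword) + radius)
        if spans and start <= spans[-1][1]:
            # Overlaps the previous span: extend it instead of adding a new one.
            spans[-1] = (spans[-1][0], end)
        else:
            spans.append((start, end))
        pos = text.find(keyword, pos + 1)
    return spans
```

Only the returned spans would then need to be searched for candidate matches, which may reduce the work performed on long strings.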

6. The method of claim 1 wherein selecting one or more
candidate matches for the pattern that meet a cost selection
criterion comprises selecting one or more candidate matches
that have corresponding costs which fall below a selected
threshold.

7. The method of claim 1 wherein selecting one or more
candidate matches for the pattern that meet a cost selection
criterion comprises selecting a predetermined number of
candidate matches that have the lowest corresponding costs.

8. The method of claim 1 wherein selecting one or more
candidate matches for the pattern that meet a cost selection
criterion comprises selecting a candidate match that has a
lowest cost and selecting additional candidate matches that
have corresponding costs which are within a predetermined
tolerance of the lowest cost.
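
The three cost selection criteria of claims 6 through 8 can be sketched side by side. Each candidate is represented here as a `(match, cost)` tuple; that representation and the function names are assumptions of the sketch.

```python
def select_below_threshold(candidates, threshold):
    """Claim 6: keep candidates whose cost falls below a threshold."""
    return [c for c in candidates if c[1] < threshold]


def select_lowest_n(candidates, n):
    """Claim 7: keep a predetermined number of lowest-cost candidates."""
    return sorted(candidates, key=lambda c: c[1])[:n]


def select_within_tolerance(candidates, tolerance):
    """Claim 8: keep the lowest-cost candidate and any others whose
    cost is within `tolerance` of that lowest cost."""
    if not candidates:
        return []
    best = min(cost for _, cost in candidates)
    return [c for c in candidates if c[1] - best <= tolerance]
```

The threshold form can return no matches at all, the fixed-number form returns matches regardless of quality, and the tolerance form adapts to the best cost actually found.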

9. The method of claim 1 further comprising adjusting
the cost selection criterion and selecting one or more
candidate matches for the pattern that meet the adjusted cost
selection criterion.
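
Claim 9 does not state how the cost selection criterion is adjusted. One hypothetical policy, sketched below, relaxes a threshold step by step until at least one candidate qualifies; the `step` and `max_threshold` parameters are assumptions.

```python
def select_with_relaxation(candidates, threshold, step=1, max_threshold=10):
    """Raise the threshold until some (match, cost) candidate has a cost
    below it, then return the selection and the threshold that was used."""
    while threshold <= max_threshold:
        chosen = [c for c in candidates if c[1] < threshold]
        if chosen:
            return chosen, threshold
        threshold += step  # adjust the criterion and try again
    return [], threshold
```

The adjustment could equally run in the other direction, tightening the criterion when too many candidates are selected.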

10. The method of claim 1 wherein the cost associated
with the corresponding candidate match for the prefix pattern,
and the cost associated with the candidate match for the
suffix pattern are more heavily weighted than the cost
associated with the candidate match for the value pattern.

11. The method of claim 1 wherein the cost associated
with each of the candidate matches for the first pattern is
determined by adding the cost associated with the
corresponding candidate match for the prefix pattern, the cost
associated with the candidate match for the suffix pattern and
the cost associated with the candidate match for the value
pattern.
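
The cost combinations of claims 10 and 11 can be illustrated with a weighted sum. With all weights equal to one, the sum reduces to the plain addition of claim 11; with larger prefix and suffix weights it reflects the weighting of claim 10. The specific weight values are assumptions chosen only for the example.

```python
def total_cost(prefix_cost, value_cost, suffix_cost,
               w_prefix=1.0, w_value=1.0, w_suffix=1.0):
    """Combine the three component costs of one candidate match."""
    return (w_prefix * prefix_cost
            + w_value * value_cost
            + w_suffix * suffix_cost)


# Claim 11: plain addition of the three costs.
unweighted = total_cost(1, 2, 1)

# Claim 10: prefix and suffix weighted more heavily than the value.
weighted = total_cost(1, 2, 1, w_prefix=2.0, w_value=0.5, w_suffix=2.0)
```

Weighting the value less heavily allows the value to differ from the example while still requiring the surrounding context to match closely.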

12. The method of claim 1 wherein identifying each
candidate match for the first pattern comprises identifying
the candidate match for the prefix pattern, wherein the
candidate match for the prefix pattern defines a first end of
a value window, then identifying a corresponding candidate
match for the suffix pattern, wherein the candidate match for
the suffix pattern defines a corresponding second end of the
value window, wherein the candidate match for the value
pattern comprises the symbols within the value window.
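
The value window of claim 12 can be sketched as follows. For brevity the sketch locates the prefix and suffix by exact search, which stands in for the approximate candidate matching recited elsewhere in the claims; the `max_value_len` bound is likewise an assumption.

```python
def value_windows(text: str, prefix: str, suffix: str, max_value_len: int = 30):
    """Return (start, end, value) for each prefix match that is followed
    by a suffix match within max_value_len symbols."""
    results = []
    p = text.find(prefix)
    while p != -1:
        start = p + len(prefix)       # first end of the value window
        s = text.find(suffix, start)  # second end of the value window
        if s != -1 and s - start <= max_value_len:
            results.append((start, s, text[start:s]))
        p = text.find(prefix, p + 1)
    return results
```

Applied to "Name: Alice; Age: 30; Name: Bob;" with the prefix "Name: " and the suffix ";", the sketch returns the values "Alice" and "Bob".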

13. The method of claim 1 further comprising filtering 
the candidate match for the value pattern using a keyword. 

14. The method of claim 1 further comprising filtering 
the candidate match for the value pattern using a regular 
expression.
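
The value filters of claims 13 and 14 might look like the following sketch, which uses Python's standard `re` module. Requiring the regular expression to match the whole value (`fullmatch`) is a design assumption; a partial match could serve equally well.

```python
import re


def filter_by_keyword(values, keyword):
    """Claim 13: keep candidate values that contain the keyword."""
    return [v for v in values if keyword in v]


def filter_by_regex(values, pattern):
    """Claim 14: keep candidate values that the regular expression
    matches in full."""
    rx = re.compile(pattern)
    return [v for v in values if rx.fullmatch(v)]
```

For example, the expression `\$\d+(\.\d{2})?` would retain "$12.50" and "$7" while rejecting "twelve".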

15. The method of claim 1 wherein identifying candidate 
matches for the prefix pattern comprises constructing an edit 
distance matrix for the prefix pattern and identifying one or 
more candidate matches for the prefix pattern, constructing an 
edit distance matrix for the suffix pattern and identifying 
one or more candidate matches for the suffix pattern, and 
identifying a candidate match for the value pattern between 
each pair of candidate prefix matches and candidate suffix 
matches.
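
A minimal sketch of the matrix-based search of claim 15 follows. It builds an edit distance matrix whose first row is all zeros, so that a match may begin at any position of the string, and keeps only the last row. Running the same routine on reversed strings to obtain suffix start positions, and the `max_cost` and `max_value_len` parameters, are assumptions of this sketch and are not recited in the claim.

```python
def match_end_costs(pattern: str, text: str):
    """Last row of the edit distance matrix with a free starting point:
    entry j is the lowest edit distance between pattern and any
    substring of text that ends at index j."""
    prev = [0] * (len(text) + 1)  # the empty pattern matches anywhere
    for i, pc in enumerate(pattern, start=1):
        curr = [i]
        for j, tc in enumerate(text, start=1):
            curr.append(min(prev[j] + 1,
                            curr[j - 1] + 1,
                            prev[j - 1] + (pc != tc)))
        prev = curr
    return prev


def match_start_costs(pattern: str, text: str):
    """Entry s is the lowest edit distance between pattern and any
    substring of text that starts at index s."""
    return match_end_costs(pattern[::-1], text[::-1])[::-1]


def candidates(text, prefix, suffix, max_cost=1, max_value_len=20):
    """Pair each acceptable prefix end with each acceptable suffix start
    and return (value, prefix_cost + suffix_cost) tuples."""
    ends = match_end_costs(prefix, text)
    starts = match_start_costs(suffix, text)
    out = []
    for e, prefix_cost in enumerate(ends):
        if prefix_cost > max_cost:
            continue
        last = min(len(text), e + max_value_len)
        for s in range(e, last + 1):
            if starts[s] <= max_cost:
                out.append((text[e:s], prefix_cost + starts[s]))
    return out
```

With `max_cost=0` the sketch behaves like an exact search; raising `max_cost` admits candidates such as the value "Bob" in "Nane: Bob;", where the prefix "Name: " matches at a cost of 1.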

16. A computer readable medium containing instructions 
which are configured to implement the method comprising: 

identifying a first pattern of symbols to be matched, 
wherein the first pattern contains a prefix pattern, a value 
pattern and a suffix pattern; 

identifying candidate matches for the first pattern in 
the string, wherein each candidate match for the first pattern 
includes a candidate match for the prefix pattern, a candidate 
match for the suffix pattern and a candidate match for the
value pattern;

determining a cost associated with each of the candidate
matches for the first pattern, wherein the cost associated
with each of the candidate matches for the pattern includes a
cost associated with the corresponding candidate match for the
prefix pattern, a cost associated with the candidate match for
the suffix pattern and a cost associated with the candidate
match for the value pattern; and

selecting one or more candidate matches for the pattern
that meet a cost selection criterion.

17. The computer readable medium of claim 16 wherein
determining a cost associated with each of the candidate
matches comprises calculating a corresponding edit distance.

18. The computer readable medium of claim 16 wherein
identifying the first pattern comprises providing a single
example string wherein the first pattern is selected from the
example string.

19. The computer readable medium of claim 16 further
comprising examining the string to identify spans of interest,
wherein each of the spans of interest meets a specified
filtering criterion.

20. The computer readable medium of claim 15 wherein the
specified filtering criterion comprises the inclusion of a
keyword.

21. The computer readable medium of claim 16 wherein
selecting one or more candidate matches for the pattern that
meet a cost selection criterion comprises selecting one or
more candidate matches that have corresponding costs which
fall below a selected threshold.

22. The computer readable medium of claim 16 wherein
selecting one or more candidate matches for the pattern that
meet a cost selection criterion comprises selecting a
predetermined number of candidate matches that have the lowest
corresponding costs.

23. The computer readable medium of claim 16 wherein
selecting one or more candidate matches for the pattern that
meet a cost selection criterion comprises selecting a
candidate match that has a lowest cost and selecting
additional candidate matches that have corresponding costs
which are within a predetermined tolerance of the lowest cost.

24. The computer readable medium of claim 16 further
comprising adjusting the cost selection criterion and
selecting one or more candidate matches for the pattern that
meet the adjusted cost selection criterion.

25. The computer readable medium of claim 16 wherein the
cost associated with the corresponding candidate match for the
prefix pattern, and the cost associated with the candidate
match for the suffix pattern are more heavily weighted than
the cost associated with the candidate match for the value
pattern.

26. The computer readable medium of claim 16 wherein the
cost associated with each of the candidate matches for the
first pattern is determined by adding the cost associated with
the corresponding candidate match for the prefix pattern, the
cost associated with the candidate match for the suffix
pattern and the cost associated with the candidate match for
the value pattern.

27. The computer readable medium of claim 16 wherein
identifying each candidate match for the first pattern
comprises identifying the candidate match for the prefix
pattern, wherein the candidate match for the prefix pattern
defines a first end of a value window, then identifying a
corresponding candidate match for the suffix pattern, wherein
the candidate match for the suffix pattern defines a
corresponding second end of the value window, wherein the
candidate match for the value pattern comprises the symbols
within the value window.

28. The computer readable medium of claim 16 further
comprising filtering the candidate match for the value pattern
using a keyword.

29. The computer readable medium of claim 16 further
comprising filtering the candidate match for the value pattern
using a regular expression.

30. The computer readable medium of claim 16 wherein
identifying candidate matches for the prefix pattern comprises
constructing an edit distance matrix for the prefix pattern
and identifying one or more candidate matches for the prefix
pattern, constructing an edit distance matrix for the suffix
pattern and identifying one or more candidate matches for the
suffix pattern, and identifying a candidate match for the
value pattern between each pair of candidate prefix matches
and candidate suffix matches.






